UKON-GC
VAST Challenge 2015
Grand Challenge
Team Members
Juri Buchmüller, University of Konstanz, buchmueller@dbvis.inf.uni-konstanz.de
Fabian Fischer, University of Konstanz, fischer@dbvis.inf.uni-konstanz.de, PRIMARY
Dirk Streeb, University of Konstanz, dirk.streeb@uni-konstanz.de
Daniel A. Keim, University of Konstanz, keim@dbvis.inf.uni-konstanz.de
Student Team: no (Master Students and PhD Candidates)
Analytic Tools Used:
- Custom Prototypes for MC1 and MC2 (Python, JavaScript, HTML5, D3.js,...)
- Various Databases (PostgreSQL, Neo4j)
- KNIME, Konstanz Information Miner
- Tableau, Data Analysis and Visualization Software
Approximately how many hours were spent working on this submission in total? 40
May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete? Yes
Video Download (WMV)
Questions
GC.1 – Scott is not a paying customer and does not have an ID. Describe Scott Jones' activities in the park during the three-day weekend. Who does he spend most of his time with? When does he arrive? When does he leave? What route does he follow?
Identification of Scott
First, we assume that the football star did not walk through the park alone. Second, we expect him to be present at every show taking place on the stage. Only one group of people in the given data is consistent with both assumptions. It can be identified easily using behavior clustering (see Figure 1) and our custom visualization (see Figure 2).
These individuals never communicate via the park app. For security guards this seems reasonable if they have a communication channel of their own, maybe headphones or something similar. Since they are not park employees, they nonetheless carry the park app, with IDs 44885, 1629516, 1781070, 1787551, 1080969, 1600469, 1935406, 521750. We think they stay with Scott the entire time he is inside the park.
Following their route, we extracted Scott's schedule for the weekend. On the way to and from the stage they follow the route shown in Figure 3, which offers many opportunities to meet fans walking around the park.
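The two assumptions above can be expressed as a simple filter. The following is only a minimal sketch, not the prototype we used: the record layout (visitor ID plus timestamp for stage check-ins), the show windows, and the function names are all assumptions made for illustration.

```python
from datetime import datetime


def attends_all_shows(stage_checkins, shows):
    """Return the IDs that have a stage check-in inside every show window.

    stage_checkins: iterable of (visitor_id, timestamp) pairs (assumed layout).
    shows: list of (start, end) datetime pairs, one per show.
    """
    seen = {}
    for visitor_id, ts in stage_checkins:
        for index, (start, end) in enumerate(shows):
            if start <= ts <= end:
                seen.setdefault(visitor_id, set()).add(index)
    return {vid for vid, hit in seen.items() if len(hit) == len(shows)}


def silent_show_regulars(stage_checkins, shows, senders):
    """IDs present at every show that never appear as a message sender."""
    return attends_all_shows(stage_checkins, shows) - set(senders)


# Toy data only; these IDs and times are invented for the example.
toy_shows = [
    (datetime(2014, 6, 6, 9, 0), datetime(2014, 6, 6, 11, 30)),
    (datetime(2014, 6, 6, 14, 0), datetime(2014, 6, 6, 16, 30)),
]
toy_checkins = [
    (1, datetime(2014, 6, 6, 9, 30)),
    (1, datetime(2014, 6, 6, 14, 30)),
    (2, datetime(2014, 6, 6, 9, 30)),   # misses the second show
    (3, datetime(2014, 6, 6, 9, 40)),
    (3, datetime(2014, 6, 6, 14, 10)),  # attends both, but sends messages
]
candidates = silent_show_regulars(toy_checkins, toy_shows, senders=[3])
```

On real data, the show windows would have to be widened enough to include the check-in of people who arrive before the show starts.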
Scott Jones' Activities in the Park (Friday)
Date | Event | Locations | Visualization and Comments
Friday, 08:45 | Scott arrives with his bodyguards in the park. | East Entry | The general pattern can be seen in Figure 2. The bodyguards and Scott always take the same route through the park as seen in Figure 3.
Friday, 09:30 | First Show | Grinosaurus Stage | During the show, the Creighton Pavilion is closed as seen in Figure 5.
Friday, 11:30 | Scott leaves the stage again. | Grinosaurus Stage | The general pattern can be seen in Figure 2.
Friday, 12:15 | Scott and his bodyguards leave the park. | East Entry | The general pattern can be seen in Figure 2.
Friday, 13:45 | They re-enter the park. | East Entry | The general pattern can be seen in Figure 2.
Friday, 14:30 | Second Show | Grinosaurus Stage | During the show, the Creighton Pavilion is closed as seen in Figure 5.
Friday, 16:30 | Scott leaves the stage. | Grinosaurus Stage | The general pattern can be seen in Figure 2.
Friday, 17:15 | They finally leave the park. | East Entry | The general pattern can be seen in Figure 2.
Scott Jones' Activities in the Park (Saturday)
Date | Event | Locations | Visualization and Comments
Saturday, 08:45 | Scott arrives with his bodyguards in the park. | East Entry | The general pattern can be seen in Figure 2.
Saturday, 09:30 | First Show | Grinosaurus Stage | During the show, the Creighton Pavilion is closed as seen in Figure 5.
Saturday, 11:30 | Scott leaves the stage again. | Grinosaurus Stage | The general pattern can be seen in Figure 2.
Saturday, 12:15 | Scott and his bodyguards leave the park. | East Entry | The general pattern can be seen in Figure 2.
Saturday, 13:45 | They re-enter the park. | East Entry | The general pattern can be seen in Figure 2.
Saturday, 14:30 | Second Show | Grinosaurus Stage | During the show, the Creighton Pavilion is closed as seen in Figure 5.
Saturday, 16:30 | Scott leaves the stage. | Grinosaurus Stage | The general pattern can be seen in Figure 2.
Saturday, 17:15 | They finally leave the park. | East Entry | The general pattern can be seen in Figure 2.
Scott Jones' Activities in the Park (Sunday)
Date | Event | Locations | Visualization and Comments
Sunday, 08:45 | Scott arrives with his bodyguards in the park. | East Entry | The general pattern can be seen in Figure 2.
Sunday, 09:30 | First Show | Grinosaurus Stage | During the show, the Creighton Pavilion is closed as seen in Figure 5.
Sunday, 10:00 - 11:30 | Possible crime | Creighton Pavilion | See answer to Question 3 for further explanation.
Sunday, 11:30 | Scott leaves the stage again. | Grinosaurus Stage | The general pattern can be seen in Figure 2.
Sunday, 12:15 | Scott and his bodyguards leave the park. | East Entry | The general pattern can be seen in Figure 2.
Sunday, 15:00 | Show is canceled | Grinosaurus Stage | Many visitors still walk to the Grinosaurus Stage. However, there was no show because of the crime that took place in the morning (see figure in Answer 2.1). Figure 2 clearly shows that the group does not come back on Sunday afternoon.
Visualizations and Figures
Figure 1 – This plot, generated with KNIME, shows the first two PCA dimensions of features extracted from the movement data. The security personnel are the circled outliers at the top left.
Figure 2 – This image is a custom visualization representing individual visitors and groups of visitors as colored columns. Columns are colored according to activity. We selected only the cluster of bodyguards who share the same pattern for all days visiting all shows.
Figure 3 – This visualization shows the bodyguard movements for every day. Scott is most likely moving with this group. Additionally they take a very long route from the entrance on the east via all areas of the park to the stage and back. We assume they want to meet many fans on their way through the park. They do not check in anywhere else.
GC.2 – Identify up to 8 issues with park operations during the three-day weekend. Provide a rationale for your answers.
1) Visitors not properly informed about canceled show
On Sunday afternoon a football show was scheduled that did not take place. In the picture, the upper white box shows many guests staying at the show in the morning (colored light red). In the lower white box, in the afternoon, there are only small dots (colored light green). Because of the map selection, these dots represent people walking to the stage, turning around, and walking away. Such a pattern is unique. We therefore conclude that a show was planned but that not all park visitors were properly informed about its cancellation. Communication should be improved, because many visitors walked to the stage and realized only on arrival that there was no show.
2) Late Opening / Waiting Times
As the graph below shows, Galactosaurus Rage opened late on Friday. The horizontal axis displays the time of day at check-in; the vertical axis shows how long people then spent in the attraction. Colors denote the three days (Friday: red, Saturday: yellow, Sunday: blue). On Friday there are no check-ins in the early morning, followed by a peak as people start waiting for the attraction to open. The earlier a visitor joins the queue, the longer the wait.
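The quantity on the vertical axis can be approximated per visitor as the time between a check-in and that visitor's next record. The sketch below assumes a hypothetical per-visitor event list of (seconds since midnight, record type) sorted by time; it illustrates the idea and is not the code behind the graph.

```python
def stay_durations(events):
    """Return (check-in time, seconds until the next record) pairs.

    events: time-sorted list of (t_seconds, kind) for ONE visitor, where
    kind is "check-in" or "movement" (assumed labels).
    A check-in that is the visitor's last record has no known duration
    and is skipped.
    """
    durations = []
    for (t, kind), (t_next, _) in zip(events, events[1:]):
        if kind == "check-in":
            durations.append((t, t_next - t))
    return durations


# Toy example: two check-ins, lasting 300 s and 80 s.
toy_events = [
    (0, "movement"),
    (10, "check-in"),
    (310, "movement"),
    (320, "check-in"),
    (400, "movement"),
]
toy_durations = stay_durations(toy_events)
```

Plotting the first element of each pair against the second, per attraction and day, gives a scatter of the kind described above.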
3) Problems with Galactosaurus Rage
On Friday evening there appears to have been another issue with Galactosaurus Rage. While all other attractions are still open and show no change in check-ins, Galactosaurus Rage has fewer check-ins between 7 and 8 pm.
4) Double Movements
For two visitors (1983765, 1600469) the movement tracking system did not work as intended. Taking 1983765 as an example, the system recorded movement in two different places on Saturday between 20:18:21 and 20:34:36. This could have happened by accident, or the system may have been hacked to cover the movements of another person. In this way the criminal could have hidden from tracking on Sunday while vandalizing the pavilion. The image below shows this double movement over time: it begins in two places far apart and ends at the east entrance at almost the same time.
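One way to search for such double movements automatically is to flag consecutive records of the same ID whose implied speed is physically impossible. This is a minimal sketch under assumptions: records are (seconds, x, y) tuples on the park grid, and the speed threshold is a free parameter that would need tuning on the real data.

```python
from math import hypot


def impossible_jumps(track, max_speed):
    """Return consecutive record pairs whose implied speed exceeds max_speed.

    track: time-sorted list of (t_seconds, x, y) for ONE visitor ID.
    max_speed: largest plausible speed in grid cells per second (assumed).
    Two different positions with the same timestamp are always flagged.
    """
    flagged = []
    for first, second in zip(track, track[1:]):
        (t0, x0, y0), (t1, x1, y1) = first, second
        elapsed = t1 - t0
        distance = hypot(x1 - x0, y1 - y0)
        if distance > 0 and (elapsed <= 0 or distance / elapsed > max_speed):
            flagged.append((first, second))
    return flagged


# Toy track: one ID alternating between two far-apart places.
toy_track = [(0, 0, 0), (10, 1, 0), (11, 50, 50), (20, 2, 0)]
toy_jumps = impossible_jumps(toy_track, max_speed=1.0)
```

An ID that is tracked in two places at once produces a long run of such flagged pairs, because its records alternate between the two locations.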
5) Stage Fan
One guest (1690685) goes to the show on Friday at 09:51:09, does not move anywhere else, checks in again at 13:00:01, and stays until 16:00:27. This is strange. Either he was in the stage area while there was no show, or the movement tracking system did not work correctly and lost his movement between the two shows. The white box below shows him apparently spending all this time in the show area on Friday.
6) Fewer thrill rides on weekend
Over the weekend the share of rides taken on Thrill Rides decreases compared to Friday. The picture below shows the number of check-ins, normalized by day, as the size of black dots placed at each attraction's position in the park. Each diagram represents one day, beginning with Friday on the left. Absolute numbers, however, increase. Taking waiting times into account, the park could offer its visitors more opportunities to ride Thrill Rides on the weekend, as current capacity is exhausted.
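The difference between absolute and normalized numbers can be made explicit with a small calculation. The sketch below assumes check-ins are available as (day, ride category) pairs; the category labels and all counts in the toy data are invented for illustration.

```python
from collections import Counter


def category_share_by_day(checkins, category):
    """Return {day: share of that day's check-ins falling in `category`}.

    checkins: iterable of (day, ride_category) pairs (assumed layout).
    """
    totals = Counter()
    hits = Counter()
    for day, ride_category in checkins:
        totals[day] += 1
        if ride_category == category:
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in totals}


# Toy numbers: thrill check-ins rise from 3 to 4, but their share falls.
toy_checkins = (
    [("Fri", "Thrill")] * 3 + [("Fri", "Kiddie")] * 1
    + [("Sat", "Thrill")] * 4 + [("Sat", "Kiddie")] * 4
)
toy_share = category_share_by_day(toy_checkins, "Thrill")
```

In the toy data the absolute number of thrill check-ins increases while their share of the day's total drops from 0.75 to 0.5, which is the same kind of pattern described above.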
7) Disappearing Communicative Guest
ID 898576 enters on Sunday and has movement data until 10:19 am, when he checks in at Kaufs-Lost-Canyon-Escape. After that he no longer appears in the movement data. While movement data is available, he moves with IDs 246969, 646143 and 734180. He nonetheless continues to communicate with his companions, and the communication data shows that he even sends messages from different areas afterwards. In the visualization below these messages are lines in different colors beginning in the highlighted row and ending in another.
This visualization, specialized for communication patterns, shows the communications between a set of IDs. The horizontal axis represents different points in time, and the colored lines show who is communicating with whom in each hour. The color represents the area in which the communication was initiated.
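A cross-check between the two data sources can surface such cases: any message sent after the sender's last movement record of the day is an inconsistency. The following sketch assumes movement records as (ID, seconds) pairs and messages as (sender ID, seconds, area) triples for a single day; these layouts and the toy values are assumptions.

```python
def messages_after_last_movement(movements, messages):
    """Return {sender_id: [(t, area), ...]} for messages sent after the
    sender's last movement record.

    movements: iterable of (visitor_id, t_seconds) for one day.
    messages: iterable of (sender_id, t_seconds, area) for the same day.
    Senders without any movement record are ignored here.
    """
    last_seen = {}
    for visitor_id, t in movements:
        last_seen[visitor_id] = max(t, last_seen.get(visitor_id, t))
    late = {}
    for sender_id, t, area in messages:
        if sender_id in last_seen and t > last_seen[sender_id]:
            late.setdefault(sender_id, []).append((t, area))
    return late


# Toy data: ID 7 stops moving at t=100 but keeps sending messages.
toy_movements = [(7, 50), (7, 100), (8, 60), (8, 500)]
toy_messages = [
    (7, 90, "Entry Corridor"),
    (7, 200, "Wet Land"),
    (7, 300, "Tundra Land"),
    (8, 400, "Wet Land"),
]
toy_late = messages_after_last_movement(toy_movements, toy_messages)
```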
8) Overnight Guest
Visitor 657863 visits the park on Friday, rides Ichthyoroberts-Rapids and disappears. During park opening on Saturday, at 8 am, he walks straight from Scholtz-Express to Main-Entrance and leaves the park. As mentioned in the additional information provided, Ichthyoroberts-Rapids is a water ride where visitors are able to go inside and watch other riders. Maybe the phone was lost or stolen and left at Scholtz-Express. The jump from Ichthyoroberts-Rapids to Scholtz-Express may be caused by corrupted Friday data. The image shows movements on Friday on the left and on Saturday on the right. On Saturday the movement goes from inside the park (left) to the north entrance (top). All other visitors start outside the park and appear at an entrance first.
The shown visitor enters the park with a group on Friday (left image) and disappears. On Saturday, the visitor suddenly appears again and leaves the park 10 minutes afterwards.
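Because all regular visitors first appear at an entrance, this anomaly can be found by checking where each ID's first record of the day lies. This is a minimal sketch; the track layout, the toy IDs, and the entrance coordinates are hypothetical placeholders rather than values taken from the challenge data.

```python
def starts_inside_park(tracks, entrances):
    """Return the sorted IDs whose earliest record is not at an entrance.

    tracks: {visitor_id: [(t_seconds, x, y), ...]} for one day.
    entrances: set of (x, y) entrance cells (assumed coordinates).
    """
    odd = []
    for visitor_id, records in tracks.items():
        if not records:
            continue
        _, x, y = min(records)  # earliest record; tuples sort by time first
        if (x, y) not in entrances:
            odd.append(visitor_id)
    return sorted(odd)


# Toy data with invented IDs and coordinates.
toy_entrances = {(63, 99), (0, 67), (99, 77)}
toy_tracks = {
    101: [(28800, 50, 57), (29400, 63, 99)],  # first seen inside the park
    102: [(28900, 63, 99), (29000, 62, 98)],  # first seen at an entrance
}
toy_odd = starts_inside_park(toy_tracks, toy_entrances)
```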
GC.3 – For the crime, describe the following, and provide your rationale: a) When did the crime occur? b) Where did the crime take place? c) Who are the most likely suspects in the crime?
a) The crime happened around Sunday, 10:00 to 11:30
This is while the show on Sunday morning is running. We also see a peak in the communication data after the first show, which is most likely related to the crime in the morning.
Figure 4 – The figure shows number of communications grouped by areas for Sunday.
After that there is no show in the afternoon, although people expected one to take place (see answer 2.1). Additionally, the news article says that park officials "never had any problems at the Pavilion prior to this", which corroborates the hypothesis that the crime happened on the last day. The article also states that the Pavilion was "locked up tight before each show". This leaves us with a time frame between 10:00 and 11:30 am on Sunday.
b) The crime happened in Creighton Pavilion
According to the news article, someone "vandalized a pavilion exhibiting Jones's memorabilia and made off with an Olympic medal" and "Creighton Pavilion was closed and locked up tight". We therefore expect the vandalism to have happened in Creighton Pavilion. Looking at check-ins at the pavilion, we can see that it was closed shortly after the first show on Sunday. This also points to the vandalism having happened there, especially as all other attractions continued their service after that time.
Figure 5 – The figure shows check-ins for the pavilion per hour for every day at the top. Below the same information is shown over a serial timeline at higher resolution.
c) Identification of possible suspects
After having found out when and where the vandalism happened and that an Olympic medal was stolen, we started looking for people who left the park immediately afterwards:
1983765, 921888, 47441, 1269018, 388355, 1336706, 812224, 1219287, 841993, 1910253, 63549, 1699779, 800878, 1584300, 2042333, 1387164, 1898843, 1752009, 213290, 196239, 919935, 973062, 649981, 1052726, 1600283, 2068577, 499292, 315807, 1405106, 2058354, 212756, 237750, 786972, 1484697, 1421406, 290906, 296930, 1288890, 1419745, 1635124
Figure 6 – The visualization shows visitors. The interactive display helps to identify those visitors leaving the park shortly after the crime.
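The selection shown in Figure 6 was made interactively. A programmatic approximation, given here only as a sketch, is to treat a record at an entrance as an exit when it is the last record of the track or is followed by a long gap. The track layout, the entrance cell, the window, and the gap threshold are all assumptions.

```python
def exits_in_window(track, entrances, start, end, min_gap):
    """Return True if the visitor appears to leave the park in [start, end].

    track: time-sorted list of (t_seconds, x, y) for one visitor and day.
    entrances: set of (x, y) entrance cells (assumed coordinates).
    An exit is a record at an entrance that is either the last record or
    followed by a gap of at least min_gap seconds without tracking.
    """
    for index, (t, x, y) in enumerate(track):
        if (x, y) not in entrances or not (start <= t <= end):
            continue
        is_last = index + 1 == len(track)
        if is_last or track[index + 1][0] - t >= min_gap:
            return True
    return False


# Toy tracks with an invented entrance cell at (99, 77).
toy_gate = {(99, 77)}
leaves_for_good = [(36000, 50, 50), (42000, 99, 77)]
just_entering = [(42000, 99, 77), (42010, 98, 77)]
leaves_and_returns = [(42000, 99, 77), (49500, 99, 77)]
```

With a window of 41400 to 45000 seconds and a minimum gap of 600 seconds, the first and third toy tracks count as exits, while the second does not.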
Based on the communication data, we see that most of them are connected in some way:
Figure 7 – The communication network of those visitors leaving the park can be used to identify groups.
However, two persons (1269018 and 1983765) did not communicate at all and are not part of the graph above, which might be suspicious. Most of the visitors re-enter the park in the afternoon, and many showed the same behavior on the previous days. Taking a closer look at those who do not enter the park again, we are down to four. Among them are the two who do not communicate.
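Groups like those in Figure 7 correspond to connected components of the communication graph restricted to the selected IDs. The sketch below uses a plain breadth-first search; the (sender, receiver) message layout and the toy IDs are assumptions. IDs that end up in a component of size one exchanged no messages with anyone else in the selection.

```python
from collections import defaultdict, deque


def communication_groups(ids, messages):
    """Return the connected components (as sets) among the selected IDs.

    ids: iterable of visitor IDs to consider.
    messages: iterable of (sender_id, receiver_id) pairs (assumed layout).
    Messages involving IDs outside the selection are ignored.
    """
    ids = set(ids)
    neighbours = defaultdict(set)
    for sender, receiver in messages:
        if sender in ids and receiver in ids and sender != receiver:
            neighbours[sender].add(receiver)
            neighbours[receiver].add(sender)
    groups = []
    visited = set()
    for start in sorted(ids):
        if start in visited:
            continue
        component = set()
        queue = deque([start])
        visited.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for other in neighbours[node] - visited:
                visited.add(other)
                queue.append(other)
        groups.append(component)
    return groups


# Toy data: IDs 1-3 form a group; 4 only talks to an outsider; 5 is silent.
toy_groups = communication_groups([1, 2, 3, 4, 5], [(1, 2), (2, 3), (4, 9)])
toy_isolated = sorted(next(iter(g)) for g in toy_groups if len(g) == 1)
```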
Figure 8 – The applied clustering confirms visitors with strange behavioral patterns. The colored columns show the selected time frame from 08:05 until 12:10.
Three of them have many check-ins at Rides for Everyone in the time frame when the crime happened. One (ID 1983765), however, seems to stay in the same place the whole time, which differs from his behavior on the two previous days (see Figure 6). We therefore took an even closer look at what this guest did and found that he appeared to be in two places between 20:18:21 and 20:34:36 on Saturday. A more extensive explanation of this phenomenon and an image are given in answer 2.4. We think he manipulated his device to suppress movement tracking and vandalized the pavilion during the time he appears to be at the BBQ. He is one of those without communication data.
Another suspect is 898576, who is already described in 2.7. He has no valid movement data in the time frame in question. Nonetheless, his device is active, as shown above.